SUMMARY


In  this  report  an  approach  to  the  concept  of  error  in  utility 
assessment  is  proposed.  Three  components  of  error  are  considered  and 
each  component  is  related  to  four  separate  elicitation  methods  — all  in 
the  context  of  a general  multiplicative  multiattribute  utility  model. 

The methods are a Keeney-Raiffa (1976) procedure, SMART (Edwards, 1977),
a Social Judgment Theory (SJT) based regression model (Hammond, Stewart,
Brehmer and Steinmann, 1975) and a new method called Holistic Orthogonal
Parameter Estimation or HOPE (Barron and Person, 1978).

If a general multiplicative model can be assumed to be an appropriate
representation of the decision maker's basic preference structure,
error can occur in the direct estimation of the scaling constants and
univariate utility functions for decomposition methods (Keeney-Raiffa
and SMART), or in the holistic assessments for holistic methods (SJT and
HOPE). Individual estimates may be merely noisy or may be fundamentally
incorrect. Furthermore, the utility model may be incorrectly specified;
for example, an additive model, rather than a multiplicative model, may
be used. The four assessment methods are considered in conjunction with
errors of each kind.

The most serious error-method combination is the case of a substantial
degree of error occurring in a single holistic judgment which
is being used in a HOPE procedure. This concern leads to a major emphasis
of this report: an expanded HOPE procedure used in conjunction
with a convergent validation strategy to estimate error in individual
holistic judgments and thus guide consistency checks.

The discussion is organized into four sections. The HOPE procedure
is summarized in Section I. In Section II, three components of assessment
error are considered in conjunction with the four elicitation procedures.
In Section III, an expanded HOPE procedure for detecting judgment
error and guiding consistency checks is proposed. In Section IV,
application considerations are outlined.

Introduction


In this report an approach to the concept of error in utility
assessment is proposed. Three components of error are distinguished;
these components are then related to four separate elicitation methods,
each of which is consistent with at least special cases of the general
multiplicative multiattribute utility (MAU) model (below). Two
methods, Keeney-Raiffa (1976) and SMART (Edwards, 1977), are pure decomposition
approaches; a third, the social judgment paradigm (Hammond,
Stewart, Brehmer, and Steinmann, 1975), is a regression approach which
relies on holistic judgments.

The  fourth  approach  is  a decomposition  procedure  for  assessing 
multiplicative  MAU  functions  which  relies  solely  on  a few  holistic 
assessments  of  utilities.  The  procedure’s  acronym  is  HOPE  for  Holistic 
Orthogonal  Parameter  Estimation.  Consistent  with  the  procedures  of 
Keeney and Raiffa (1976), the HOPE procedure exploits the basic preferences
of the decision maker to specify the utility function. HOPE
differs in that it uses a response mode more familiar to laymen than
those of other methods of MAU elicitation, namely holistic assessment of (a
few) profiles, to determine the scaling constants and univariate
utility functions comprising the multiplicative utility function.

The larger question behind any analysis of error in assessed utility
functions concerns validation. There are three basic approaches to
validation of assessed utility functions: (1) use of an external
criterion; (2) validating the basic preference structure of the decision
maker; (3) convergent validity. Of these, the most straightforward is
use of an objective, externally defined criterion, when one is available.

Edwards (1974) has suggested two instances of available criteria:
diamonds and bank credit. The American Gemological Institute "diamond
model" formally evaluates diamonds based on four attributes: color,
cut, clarity, and carats. Banks evaluate applicants for credit cards on
the basis of attributes contained on standard application forms (e.g.,
disposable income, own versus rent, debt, employment history, etc.),
while actual probability and amount of default, if any, are known
empirically.

Edwards and his associates have suggested a variant of the multiple
cue probability learning paradigm (Hammond, Stewart, Brehmer and Steinmann,
1975) as a means of creating an external criterion. Subjects are
first trained to use a weighted additive utility function; this is
followed by eliciting the learned utilities. This procedure has been
pilot-tested at SSRI.

The  purpose  of  studies  using  an  external  criterion  is  to  compare 
and  evaluate  the  performance  of  elicitation  techniques.  The  usefulness 
of  such  studies  rests  on  the  assumption  that  comparative  efficacy  of 
elicitation  methods  generalizes  from  situations  where  an  external 
criterion exists to situations where one does not. Of course, external
criteria do not generally exist in de novo assessment of utilities in
laboratory or field settings. Furthermore, elicitation is unnecessary
when such a criterion exists. Thus, in application as opposed to experimental
settings, validity necessarily depends on either validation
of the basic preference model itself or on convergent validity.

Validating the basic preference model requires verifying qualitative
properties of preferences which are sufficient for representing preferences
by a particular mathematical model. In the decomposition approach
of Keeney and Raiffa (1976), the qualitative properties of preferential
independence and utility independence are explicitly examined. Preferential
independence requires that preference orders over attribute pairs
be independent of the fixed levels of the remaining attributes. Utility
independence requires that preferences for lotteries involving a single
attribute be independent of the fixed levels of the remaining attributes.
If these qualitative properties of preferences hold, then the
preference structure can be represented by either a multiplicative
model, equation (1), or by an additive model, equation (2).

\[
1 + K\,U(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} \bigl(1 + K k_i u_i(x_i)\bigr) \tag{1}
\]

where, for i = 1, 2, ..., n

x_i is the level of attribute i
0 ≤ u_i(·) ≤ 1 is a utility function
0 < k_i < 1 is a scaling constant
-1 < K is a parameter
U(·) is overall utility.

Upon writing equation (1) simply as an expression for U(·) and taking
limits as K goes to 0, the resulting model is the additive model, equation (2):

\[
U(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} k_i u_i(x_i) \tag{2}
\]
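
The limiting step from the multiplicative to the additive model can be written out explicitly. The expansion below is not part of the original text; it is a sketch that uses only the symbols of equations (1) and (2) and assumes K ≠ 0 before the limit is taken.

```latex
\begin{aligned}
U(x_1,\ldots,x_n)
  &= \frac{1}{K}\Bigl[\prod_{i=1}^{n}\bigl(1 + K k_i u_i(x_i)\bigr) - 1\Bigr] \\
  &= \sum_{i=1}^{n} k_i u_i(x_i)
     + K \sum_{i<j} k_i k_j\, u_i(x_i)\, u_j(x_j)
     + \cdots
     + K^{\,n-1} \prod_{i=1}^{n} k_i u_i(x_i) \\
  &\longrightarrow \sum_{i=1}^{n} k_i u_i(x_i) \qquad (K \to 0).
\end{aligned}
```

Every term after the first sum carries at least one factor of K, so only the additive part survives in the limit.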

The basic preference model is validated by checking the conditions of
preferential and utility independence. For an especially informative
assessment  protocol  in  which  the  conditions  are  checked,  see  Keeney 
(1977). 

The choice between models 1 and 2 in an application setting is not
necessarily a simple choice. The choice can be viewed as an empirical
question, in which the parameter K = 0 if and only if \(\sum_{i=1}^{n} k_i = 1\). The
choice can be based behaviorally on the marginality condition (Fishburn,
1965). Model 2 is appropriate only if the decision maker's preferences
for multiattributed risky consequences depend only on the marginal distribution
associated with each attribute. Experimental studies (e.g.,
both conducted and surveyed by von Winterfeldt (1976)) have often found
violations of the marginality condition. Finally, the choice can be
made on more pragmatic grounds. Additive models are simpler, provide
excellent predictions, and can be used in conjunction with simpler
procedures (e.g., the SMART procedure discussed in this report). In
fact, referring to the same experimental studies von Winterfeldt states:

The  message  that  these  experiments  convey  seems  contradictory: 
in  spite  of  obvious  model  violations  (tests  of  marginality  and 
tests  of  risky  additivity  failed),  additive  models  . . . predict 
subjects' preferences and utility judgments very well (p. 24).

Convergent validation assumes that logically equivalent elicitation
procedures will assign comparable utilities to the same multiattributed
outcome. In his convergent validation study Fischer (1977) elicited
utilities for 27 hypothetical jobs from each of 10 subjects in three
different ways: (1) direct holistic assessments, (2) via a Keeney-Raiffa
decomposition, and (3) via the SMART procedure. The utilities
predicted for the 27 jobs (within subject) by the Keeney-Raiffa and
SMART procedures exhibited high correlations with each other and were
highly correlated with the holistic assessments.

Slovic, Fischhoff and Lichtenstein (1977) have criticized using
correlation between predictions of MAU models and holistic judgments as
evidence that the MAU model is valid. Their criticism is based on the
use of unaided holistic preference as a criterion. They further state,

". . . a decade or more of research has abundantly documented that
humans are quite bad at making complex unaided decisions (Slovic [1972]);
it could thus be argued that high correlations with such flawed judgments
would suggest a lack of validity" [p. 22].

At face value the remarks of Slovic et al. are damaging to the HOPE
procedure described in this report. In fact, a major emphasis of this
report is the use of an expanded HOPE procedure in conjunction with a
convergent validation strategy to estimate error in predicted utilities
and thus to identify outliers in the set of holistic judgments.
Thus, the view represented by the remarks of Slovic et al. must be
considered.

I believe decision makers can provide useful holistic assessments
of (a few) multiattributed consequences, especially in cases involving a
small number of attributes, say fewer than ten. Furthermore, there is
considerable evidence to support my belief. Two research traditions
heavily based on holistic assessment are social judgment theory (Hammond
et al.) and information integration theory (Anderson [1974]). Numerous
empirical studies in these traditions alone constitute examples of
judges providing holistic judgments. Of course, the cognitive
processes which produce holistic judgments are not well understood, nor are
the exact parameters of such judgments known. It is generally accepted
that such judgments are often systematic but noisy (Fischer, 1977); in
fact, bootstrapping (Dawes and Corrigan, 1974) capitalizes on these
features. It is also presumed that noise increases as the number of
attributes increases. Finally, it is plausible to presume that noise
increases as the number of required holistic evaluations increases.

The HOPE procedure requires direct holistic assessment of the
utility of each of a small number of multiattributed consequences comprising
a highly fractionated experimental design. By small number, I
mean fewer than fifty consequences, a number suggested by both Keeney
and Raiffa (1976, p. 222) and Fischer (1977). My basic assumption is
that decision makers can provide the requisite holistic assessments;
furthermore, I recognize that these assessments will be noisy. Moderate
noise, per se, does not materially reduce the efficacy of the HOPE
procedure. A previous paper (Barron and Person, 1978) demonstrated that
the HOPE procedure could recover known MAU functions from simulated
noisy holistic judgments. Recovery was excellent in the presence of
moderate amounts of noise, defined as normally distributed additive
error having a standard deviation of .05. Model specification error,
defined as the use of an additive model fitted to error-free holistic
judgments computed from known multiplicative models, produced higher
prediction error than did moderate noise in conjunction with an appropriate
multiplicative model using an estimated K value.

The HOPE procedure generalizes a procedure for estimating main
effects in an additive model (Green, 1971) to the assessment of the
multiplicative utility functions of Keeney (1974). Multiplicative (and
additive) utility functions are of interest for three reasons. First,
the multiplicative model represents the decision maker's preference
structure if certain preferential and utility independence conditions
are satisfied. In those cases in which these conditions have been
verified, the multiplicative model is valid. Second, the multiplicative
utility function is of practical importance, even if the requisite
assumptions do not hold precisely (Keeney and Raiffa, 1976, p. 298).
Third, formally the general multiplicative model encompasses the kinds
of utility functions resulting from each of the four assessment approaches
considered in this paper.

This  report  is  organized  into  four  sections.  The  HOPE  elicitation 
procedure  is  described  in  section  I.  In  section  II,  three  components  of 
utility  assessment  error  are  considered  in  conjunction  with  the  four 
elicitation procedures. In section III, a method for detection of judgment
error is proposed and illustrated by application to the data of
Fischer  (1977).  In  section  IV,  application  considerations  are  outlined. 

I. HOPE: A Utility Elicitation Procedure†

The HOPE and Keeney-Raiffa* procedures differ only in the parameter
estimation phase. HOPE estimates the parameters (the univariate
utility functions and scaling constants) of the multiplicative family
of utility functions from holistic judgments of utility. The estimates
are derived; they are inferences which are completely based on the
appropriateness of the underlying preference structure. A Keeney-Raiffa
procedure, while relying on the appropriateness of the same underlying
preference structure, differs primarily in that scaling constants and
utility functions are individually assessed.

†Section I is a revised version of section I of Barron and Person
(1978).

*Unless noted otherwise, by "a Keeney-Raiffa procedure" is meant
their general approach to assessing a multiplicative utility function as
described in Keeney and Raiffa, 1976, pp. 297-304.

The HOPE procedure consists of three phases: (1) preparation,
meaning those aspects common to both HOPE and Keeney-Raiffa approaches
(necessary preliminaries, identification and definition of attributes,
determination of value ranges, verification of appropriate independence
conditions); (2) eliciting direct holistic assessments for the specific
multiattributed consequences of an appropriate orthogonal design; (3)
performing the arithmetic on the holistic assessments required to deduce
scaling constants and utility functions.

Preparation 

All approaches to multiattribute utility assessment involve certain
necessary preliminaries. Clearly the stage must be set, the respondent
must realize the purpose of the exercise, and mutual understanding sufficient
for effective communication must be established. Following preliminaries,
both approaches would identify and define value-relevant
attributes. For each attribute, both would determine an appropriate
range of values, including which level is worst and which is best.
Finally, both would check to see that necessary preferential independence
and utility independence assumptions are met. As a consequence of the
independence conditions, both would conclude that Keeney's general
multiplicative model, equation (1), is an appropriate MAU function. At
this point, a Keeney-Raiffa procedure differs from the HOPE procedure by
assessing utility functions and scaling constants individually. The
utility functions, u_i, are assessed via standard gambles in the usual
way (Keeney and Raiffa, Chap. 4). The parameter k_i is interpreted, and
often in practice assessed, as an indifference probability in the standard
gamble which yields consequence BEST (defined as all attributes at
their best levels) with probability k_i and consequence WORST (all attributes
at their worst levels) with probability 1 - k_i, versus the certain
consequence with attribute i at its best level and all other attributes
at their worst levels. Scaling constants can also be assessed in other
ways. In the HOPE procedure, neither the utility functions, u_i, nor the
scaling constants, k_i, are directly assessed. Rather, the HOPE procedure
infers utility functions and scaling constants from holistic
assessments of consequences defined by an appropriate orthogonal design
as described below.
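
The reading of k_i as an indifference probability follows directly from equation (1). The lines below are a worked check rather than part of the original text; they assume the usual scaling U(BEST) = 1 and U(WORST) = 0, with each u_j equal to 0 at its worst level and 1 at its best level. The symbol c_i is introduced here only as shorthand for the certain consequence.

```latex
\begin{aligned}
&\text{Certain consequence } c_i:\ \text{attribute } i \text{ at its best level, all others at their worst levels.} \\
&1 + K\,U(c_i) = \bigl(1 + K k_i \cdot 1\bigr)\prod_{j \ne i}\bigl(1 + K k_j \cdot 0\bigr) = 1 + K k_i
  \;\Longrightarrow\; U(c_i) = k_i \\
&\text{Gamble: } p\,U(\text{BEST}) + (1-p)\,U(\text{WORST}) = p \cdot 1 + (1-p)\cdot 0 = p \\
&\text{Indifference between gamble and } c_i:\quad p = U(c_i) = k_i
\end{aligned}
```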

Orthogonal  Arrays  Define  Consequences  for  Holistic  Assessment 

Orthogonal  experimental  designs  generally  require  the  lowest  number 
of  holistic  consequence  assessments  for  additive  main-effect  non-confounded 
parameter estimation. A catalog of useful orthogonal designs and variations
is provided by Addelman (1962).

*The qualitative properties of preferential and utility independence
are described succinctly in Keeney (1977, p. 271) and amplified in
Keeney and Raiffa (1976, Chap. 6).

An example of a particular design appropriate for up to 5 attributes
with up to 4 levels per attribute is Addelman's "Basic Plan 3" shown in
Table 1. For example, consequence 3 in Table 1 is defined as level 1 of
attribute 1, level 3 of attributes 2 and 3, level 4 of attribute 4, and
level 2 of attribute 5. This consequence, along with the 15 others of
Table 1, plus one additional consequence defined in Appendix 1 and used
to estimate the parameter K, would then be holistically evaluated. Note
that consequence 1 represents the worst level of each attribute, and
would be assessed to have zero utility in the preparation phase. A
worst reference outcome is common to the orthogonal designs on which
HOPE is based.
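
A 16-consequence design of this kind can be generated mechanically. The Python sketch below uses a standard construction over the four-element field GF(4); it is an illustration, and it is only an assumption that its rows coincide with Table 1 throughout, although it does reproduce consequences 1 and 3 as described above. The function names and the coding of level 1 as the worst level are choices made for this example.

```python
from itertools import combinations, product

# GF(4) multiplication table on the elements 0, 1, 2, 3.
# Addition in GF(4) is bitwise XOR.
GF4_MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]


def orthogonal_design_16x5():
    """Return 16 consequences, each a tuple of 5 attribute levels coded 1..4."""
    rows = []
    for a, b in product(range(4), repeat=2):
        coded = (a, b, a ^ b, a ^ GF4_MUL[2][b], a ^ GF4_MUL[3][b])
        rows.append(tuple(level + 1 for level in coded))
    return rows


def is_orthogonal(rows, n_levels=4):
    """True if every pair of columns contains each pair of levels equally often."""
    n_cols = len(rows[0])
    for c1, c2 in combinations(range(n_cols), 2):
        counts = {}
        for row in rows:
            key = (row[c1], row[c2])
            counts[key] = counts.get(key, 0) + 1
        if len(counts) != n_levels ** 2 or len(set(counts.values())) != 1:
            return False
    return True


design = orthogonal_design_16x5()
for number, row in enumerate(design, start=1):
    print(number, row)
print("orthogonal:", is_orthogonal(design))
```

Because every pair of attributes shows each combination of levels exactly once, the effect of one attribute can be averaged out when estimating another, which is what the arithmetic of the analysis phase relies on.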

Holistic responses may be either direct ratings, appropriate for
riskless utilities, or standard gamble indifference probabilities,
appropriate for risky utilities. For riskless utility assessment one
reference case consisting of the best level of all attributes is assigned
100 points; a second reference case defined as the worst level of
all attributes is assigned 0 points. The remaining consequences defined
by the experimental design are then rated individually along the 0 to
100 point scale. Rating data, normalized by dividing by 100, are then
treated as if they were interval-scaled responses. For lottery-type
utility assessments the design consequences could be considered as sure-thing
consequences in a standard gamble with the same two reference
cases as uncertain outcomes. Related riskless procedures using 6-to-11
point direct rating scales have been used in marketing applications
such as hospital promotion (Wind and Spitz, 1976).

Analysis  (Arithmetic)  of  Holistic  Responses 

Succinctly stated, the HOPE procedure receives m noisy holistic
judgments as input. Using 2 judgments, the parameter K is estimated
(step 1, Appendix 2). Depending on the value of K, either m - 1 judgments
are subjected to the arithmetic of the additive model (steps 2-5
of Appendix 2), or m - 1 transformed judgments are subjected to the
arithmetic of the multiplicative model (steps 6-10 of Appendix 2). The
result or output is a complete set of k_i and u_i(x_i) values, the latter
for each level of each attribute specified by the design.

*Carmone, Green and Jain (1977) cite a figure of over 200 industrial
applications of additive (conjoint measurement) models ranging
from use of full factorials and ranking responses to use of orthogonal
arrays and rating responses. These applications assume equation (2) to
be an appropriate model.

The analysis of the holistic evaluations is made extremely simple
by the use of an orthogonal design. It merely involves arithmetic. The
main effect of any given level of any given attribute is the sum of
the holistic values of all consequences containing the attribute at the
given level (divided by the number of such consequences), minus the
corresponding sum of the holistic values of all consequences containing
the worst level of the same given attribute (again divided by the number
of such consequences). If the procedure described in Appendix 1 yields
an estimated K = 0, then the additive model, equation (2), is appropriate,
and the estimated main effects must be normalized by dividing
each estimate by the sum of the estimates for the best levels. For each
attribute, this procedure estimates k_i u_i(x_i) in general and k_i when x_i is at
its best level, since 0 < k_i < 1. Otherwise, the multiplicative
model, equation (1), is appropriate. In this case each holistic value,
U, is first transformed to ln(1 + KU), where ln is the natural logarithm.
The above analysis without normalization is then simply carried
out on the transformed holistic values. For the multiplicative model
this procedure estimates the quantity ln(1 + K k_i u_i(x_i)) in general, and
ln(1 + K k_i) when x_i is at its best level. Since K is estimated separately,
it is then possible to compute all k_i and u_i(x_i) values.
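
The main-effects arithmetic can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure of Appendix 2: K is taken as already known rather than estimated, the 16-run design is the GF(4) construction (an assumed stand-in for Table 1), levels are coded 0 (worst) to 3 (best), and the "true" model generating the judgments is made up. All function names are hypothetical.

```python
import math
from itertools import product

GF4_MUL = [[0, 0, 0, 0], [0, 1, 2, 3], [0, 2, 3, 1], [0, 3, 1, 2]]


def design_16x5():
    """16-run orthogonal design, 5 attributes, levels coded 0 (worst) .. 3 (best)."""
    return [(a, b, a ^ b, a ^ GF4_MUL[2][b], a ^ GF4_MUL[3][b])
            for a, b in product(range(4), repeat=2)]


def main_effects(design, values, n_levels=4):
    """Mean value at each level of each attribute minus the mean at the worst level."""
    effects = []
    for attr in range(len(design[0])):
        means = []
        for level in range(n_levels):
            picked = [v for row, v in zip(design, values) if row[attr] == level]
            means.append(sum(picked) / len(picked))
        effects.append([m - means[0] for m in means])
    return effects


def hope_estimates(design, judgments, K):
    """Return (k, u): scaling constants and univariate utilities at the design levels."""
    if K == 0:  # additive branch: normalize by the sum of the best-level effects
        effects = main_effects(design, judgments)
        total = sum(e[-1] for e in effects)
        ku = [[x / total for x in e] for e in effects]
    else:  # multiplicative branch: analyze ln(1 + K U), then invert the transform
        transformed = [math.log(1 + K * U) for U in judgments]
        effects = main_effects(design, transformed)
        ku = [[(math.exp(x) - 1) / K for x in e] for e in effects]
    k = [e[-1] for e in ku]  # k_i u_i at the best level equals k_i
    u = [[x / k_i for x in e] for e, k_i in zip(ku, k)]
    return k, u


# Hypothetical "true" model, used only to generate error-free judgments.
K_TRUE = 2.0
k_true = [0.10, 0.15, 0.20, 0.10]
rest = math.prod(1 + K_TRUE * k for k in k_true)
k_true.append(((1 + K_TRUE) / rest - 1) / K_TRUE)  # chosen so that U(BEST) = 1
u_true = [[0.0, 0.5, 0.8, 1.0]] * 5  # the same made-up curve for every attribute


def true_utility(row):
    prod_term = math.prod(1 + K_TRUE * k * u[level]
                          for k, u, level in zip(k_true, u_true, row))
    return (prod_term - 1) / K_TRUE


design = design_16x5()
judgments = [true_utility(row) for row in design]
k_hat, u_hat = hope_estimates(design, judgments, K_TRUE)
print([round(k, 4) for k in k_hat])
print([round(x, 4) for x in u_hat[0]])
```

With error-free judgments and the correct K, the orthogonality of the design makes the contributions of the other attributes cancel in each difference of means, so the generating k_i and u_i values are recovered up to rounding. Noisy judgments would perturb the estimates, as the report discusses.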

II.  Error  in  Assessed  Utility  Functions 

Assuming the general multiplicative model is an appropriate representation
of the basic preference structure, error can occur in the
direct estimation of the scaling constants and utility functions for
decomposition methods or in the holistic assessments for holistic methods.
The individual estimates may be merely noisy, or may be fundamentally
incorrect. In predicting utilities, the analyst may further mis-specify
the model (e.g., may choose equation (2) rather than equation (1)).
Thus, prediction errors may be related to one or more of the following
errors: (1) model specification error; (2) noisy subjective estimates;
(3) substantial random error.

It is difficult to characterize substantial errors of judgment.
Surely, a substantial error is a judgment which would be altered upon
reflection. It would exhibit low test-retest reliability. In a statistical
sense a substantial error is an outlier.

Examples of the three types of error occurred in my recent field
study of professional audit judgments. In that study audit partners
considered hypothetical client firms described by sets of financial
statements. For each firm a dollar amount defining material error was
estimated to guide statistical sampling procedures. Some dollar amounts
were used to estimate the parameters of a judgment model; the remaining
estimates constituted a holdout sample to be predicted from the model.
Several sources of error could have contributed to prediction error.
First, the assumed additive prediction model could have been an incorrect
specification. Second, the original estimates were stated to the nearest
$25,000. With this type of rounding, a "true" value of, say, $190,000
would sometimes have been assessed as $175,000 and sometimes as $200,000.
Or sometimes, the estimate would have been "either $175,000 or $200,000."
Third, any of the judgments to be predicted or the judgments used to
estimate the parameters could have been either merely noisy or substantially
wrong. Test-retest cases and deviations from additive predictions
indicated the general level of noise. In instances where there
was substantial prediction error, participants were asked to reexamine
cases whose estimated dollar amounts either influenced parameter estimates
or represented holdout cases. For many firms the original dollar
amount was deemed "correct," but in a few cases judged dollar amounts
were substantially revised. Comments like "I don't know what I could
have been thinking" accompanied such revisions. Yet clearly there was
no definitive criterion for "noise" versus "substantial error."

In  the  remainder  of  this  section  we  examine  the  four  elicitation 
procedures, designated KR (for Keeney-Raiffa), SMART, SJT (for social
judgment  theory)  and  HOPE,  in  conjunction  with  the  possible  effect  on 
each  of  specification  error,  noise,  and  substantial  random  error. 

Model Specification Error

All four methods estimate univariate utility functions and scaling
constants for either the additive model, equation (2), or the multiplicative
model, equation (1). SMART and SJT consider only additive models.
Thus, a priori, SMART and SJT models are incorrectly specified for all
true values of the multiplicative parameter K ≠ 0.

There are several studies which indicate the effect of specification
error within elicitation methods. Fischer (1977) observed high
correlations (median = .982) between KR additive predictions and KR
multiplicative predictions. Simulating the HOPE procedure using known
multiplicative models with extreme K values (K = -.94 or K = 4.11) with
two levels of response noise (normally distributed, mean 0 and standard
deviations of .025 and .05), Barron and Person (1978) found larger
errors of prediction for incorrectly specified additive models (with or
without noise) than for correctly specified models with noise. Furthermore,
as the standard deviation of response error was decreased from .05
to .025, prediction error for correctly specified multiplicative models
was cut in half, while for incorrectly specified additive models with
noise, prediction error persisted at 80% and 90% of its former level.
Analysis of Fischer's risky data by HOPE procedures shows that for 8 of 10
subjects multiplicative models produce a lower root mean square error in
predicting the entire set of holistic judgments than do additive models.
A statistical analysis of the same risky data by Fischer found that 6 of 10
subjects departed significantly (α = .05) from additivity, and for each
of the 6 cases, multiplicative models produced a lower standard error of
estimate. Finally, all 30 estimates of K (3 estimates for each of
Fischer's 10 risky data sets) using the procedure described in Appendix
1 were found to be different from 0.
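
The qualitative effect of specification error can be seen in a toy case. The Python sketch below is not a replication of the cited simulations: it assumes a made-up three-attribute model with K = 4.11, two levels per attribute (worst and best), error-free judgments over a full 2^3 factorial, and a known K for the multiplicative fit. It compares the normalized additive main-effects fit with the fit on ln(1 + KU).

```python
import math
from itertools import product

K_TRUE = 4.11
k_true = [0.20, 0.15]
rest = math.prod(1 + K_TRUE * k for k in k_true)
k_true.append(((1 + K_TRUE) / rest - 1) / K_TRUE)  # chosen so that U(BEST) = 1

# Full 2^3 factorial: 0 = worst level (u = 0), 1 = best level (u = 1).
design = list(product((0, 1), repeat=3))


def mult_utility(k, row):
    """Equation (1) solved for U, with u_i equal to the 0/1 level code."""
    prod_term = math.prod(1 + K_TRUE * k_i * level for k_i, level in zip(k, row))
    return (prod_term - 1) / K_TRUE


def best_level_effects(values):
    """Mean value with the attribute at best minus mean value with it at worst."""
    effects = []
    for attr in range(3):
        best = [v for row, v in zip(design, values) if row[attr] == 1]
        worst = [v for row, v in zip(design, values) if row[attr] == 0]
        effects.append(sum(best) / len(best) - sum(worst) / len(worst))
    return effects


def rmse(predicted, actual):
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


judgments = [mult_utility(k_true, row) for row in design]  # error-free holistic values

# Mis-specified additive fit: normalized main effects, equation (2).
raw = best_level_effects(judgments)
k_add = [e / sum(raw) for e in raw]
pred_add = [sum(k * level for k, level in zip(k_add, row)) for row in design]

# Correctly specified fit: main effects of ln(1 + K U), equation (1).
log_effects = best_level_effects([math.log(1 + K_TRUE * U) for U in judgments])
k_mult = [(math.exp(e) - 1) / K_TRUE for e in log_effects]
pred_mult = [mult_utility(k_mult, row) for row in design]

rmse_add = rmse(pred_add, judgments)
rmse_mult = rmse(pred_mult, judgments)
print("additive RMSE:", rmse_add)
print("multiplicative RMSE:", rmse_mult)
```

In this noise-free setting the multiplicative fit reproduces the judgments up to rounding, while the additive fit leaves a clearly nonzero error. The size of that error depends entirely on the made-up parameters and should not be compared with the magnitudes reported in the studies.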

Analysis of the laboratory data of Fischer and the simulation data
of Barron and Person suggests that model specification error increases
prediction error. The HOPE method provides a simple direct estimate of
K. The KR method computes K as a function of the scaling constants,
while SMART and SJT work directly with relative weights assuming K = 0.
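
For the KR route, evaluating equation (1) at BEST (every u_i = 1 and U = 1) gives 1 + K = Π(1 + K k_i), which ties K to the scaling constants. The Python sketch below solves that relation by bisection; it is an illustration, the function name `solve_K` is hypothetical, and roots very close to 0 (sum of k_i almost exactly 1) are not resolved reliably by this simple bracketing.

```python
import math


def solve_K(k, iterations=200):
    """Nonzero root K > -1 of 1 + K = prod(1 + K * k_i); 0.0 when sum(k) is 1."""
    if len(k) < 2 or not all(0 < k_i < 1 for k_i in k):
        raise ValueError("need at least two scaling constants, each in (0, 1)")

    def f(K):
        return math.prod(1 + K * k_i for k_i in k) - (1 + K)

    total = sum(k)
    if abs(total - 1) < 1e-9:
        return 0.0  # additive case
    if total < 1:
        # f is negative just to the right of 0 and eventually positive.
        lo, hi = 1e-9, 1.0
        while f(hi) <= 0:
            hi *= 2
    else:
        # f is positive near -1 and negative just to the left of 0.
        lo, hi = -1 + 1e-12, -1e-9
    f_lo = f(lo)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2


print(solve_K([0.4, 0.4]))  # sum below 1, so K is positive
print(solve_K([0.6, 0.6]))  # sum above 1, so K lies between -1 and 0
print(solve_K([0.5, 0.5]))  # sum equal to 1, additive case
```

The sign pattern matches the statement earlier in the report that K = 0 if and only if the scaling constants sum to 1.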

The individual methods differ in their respective approaches to
specification of the univariate utility functions, u_i(·). KR and SMART
assess the u_i directly. SMART asks respondents to assign values
directly to selected levels of attribute i, i = 1, 2, ..., n. KR
assesses a univariate utility via standard gambles: finding a few certainty
equivalents, followed by fairing in a curve. Direct assessment of
certainty equivalents is especially vulnerable to Tversky's (1977) suggested
certainty effect bias.

The HOPE and SJT methods infer the u_i statistically from the holistic
responses. SJT uses multiple linear regression; candidate u_i functions
are polynomial functions of x_i. The ability of linear models to
account for substantial proportions of variance (e.g., Yntema and Torgerson,
1961; Slovic and Lichtenstein, 1971) often leads to u_i functions which
are linear in x_i. The HOPE procedure provides a single point estimate
of u_i for each x_i specified by the design. Since there are no degrees
of freedom for error estimation in the single design HOPE procedure,
a substantial error in a single holistic judgment could produce
erroneous u_i functions.




With respect to weights, and more generally, scaling constants,
SMART directly assesses relative weights which are then normalized; KR
assesses scaling constants directly via standard gambles or sets of
equations reflecting specific tradeoffs (Keeney and Raiffa, p. 267);
HOPE infers scaling constants consistent with the ex ante model specification;
and SJT infers beta weights from the regression. SJT weights
are influenced by both model specification error (additive only) and
univariate utility function specification error (usually linear).

Tradeoffs between judgment error and modeling error are complex.
SMART weights are simpler to assess than are KR scaling constants.
The advantage of simpler judgments (relative weights) may outweigh the
disadvantage of model specification error (assuming K = 0). By contrast,
SJT and HOPE procedures use equivalent holistic judgments. In this case,
assuming an additive model, i.e., K = 0, serves as a constraint.

Noise 

KR and SMART procedures estimate scaling constants or weights and
univariate utility functions separately. Thus noise in the estimate of
one set of parameters should not affect estimates of the other set. Of
course, the values of the scaling constants depend on the attribute
ranges. SMART weights are normalized by the sum of the estimated values.
Thus SMART estimates are sensitive only to errors in the relative values
of these estimates. Keeney (1977, p. 284 ff) first ranks attribute
(ranges) by importance and then assesses specific values via standard
gambles and/or tradeoffs. Standard gambles will underestimate scaling
constants to the extent they are subject to Tversky's certainty effect.
Using direct tradeoffs, noise in the univariate utility function values
will lead to noise in scaling constant values.
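
The point about SMART normalization can be illustrated with a minimal
Python sketch. The function name normalize_weights and the raw ratings
(10, 30, 60) are illustrative assumptions, not part of any published
SMART protocol. The sketch shows that rescaling every raw rating by the
same factor leaves the normalized weights unchanged, whereas an error in
one rating relative to the others shifts all of the weights.

```python
def normalize_weights(raw):
    """Divide each raw importance rating by the sum of all ratings."""
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical raw importance ratings for three attribute ranges.
raw = [10.0, 30.0, 60.0]
weights = normalize_weights(raw)

# A common scale error (every rating doubled) cancels in normalization.
scaled = normalize_weights([2.0 * r for r in raw])

# A relative error (third rating 10 percent too high) does not cancel:
# the third weight rises and the other two fall.
perturbed = normalize_weights([10.0, 30.0, 66.0])
```

Because only the ratios of the raw ratings survive normalization, this
is one way to see why SMART estimates are sensitive only to errors in
relative values.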

SJT infers the scaling constants, assuming linear utility functions,
by regressing holistic judgments on standardized attribute levels.
The assumption of linear conditional utility functions introduces error;
a second source of error is noise in the holistic judgments. The usual
error theory of regression analysis provides estimates of the sensitivity
of the inferred scaling constants (beta weights).

HOPE simultaneously infers conditional utility functions and scaling
constants from holistic judgments. The orthogonal arrays utilized
by this procedure are extremely efficient regression designs. There
are, in fact, zero degrees of freedom. If the value of a single holistic
judgment is changed, the estimated value of at least one scaling
constant and several u_i values will change. If the holistic judgment
is merely noisy, the effects on k_i and u_i(·) are minimal; if the judgment
reflects "substantial random error," the effects will be substantial.
Each of these last two statements finds support in the individual
simulation runs of Barron and Person (1978).

Substantial  Random  Error 

"Holistic  evaluative  judgments  are  characterized  by  a substantial 
degree of random error," states Fischer (1977, p. 296). If so, the SJT,
HOPE, and KR assessments are affected. SJT and HOPE use only holistic
assessments as data; KR assessments of scaling constants in a standard
gamble context require a holistic assessment of outcomes having one
attribute at its best level and all other attributes at their worst
levels. In this latter case, Tversky's certainty effect bias is confounded
with the predicted "substantial degree of random error."

Judgment error per se is not a part of the formal theory of MAU.
Error may be handled implicitly via sensitivity analysis. In the
assessment stage, the careful analyst performs numerous consistency checks
in an attempt to prevent substantial error. For example, in SMART, if
the relative values of attribute ranges I, II, and III are 10, 30, and
60 respectively, i.e., II is 3 times as important as I, while III is 6
times as important as I, then as a consistency check, III should be
judged 2 times as important as II. For KR procedures, numerous
consistency checks are illustrated in a detailed protocol (Keeney, 1977).
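
The SMART ratio check in the example above can be sketched in Python.
This is a minimal illustration under stated assumptions: the function
names ratio_triangle and inconsistent_pairs and the relative tolerance
tol are hypothetical, since the report gives no numeric criterion for
when a directly judged ratio counts as inconsistent.

```python
def ratio_triangle(values):
    """Upper-triangular table of implied ratios.

    Entry (i, j), with i < j, is values[j] / values[i].
    """
    n = len(values)
    return {
        (i, j): values[j] / values[i]
        for i in range(n)
        for j in range(i + 1, n)
    }


def inconsistent_pairs(values, judged, tol=0.25):
    """Pairs whose directly judged ratio is far from the implied ratio.

    judged maps (i, j) to the ratio stated by the respondent; tol is an
    assumed relative tolerance, not a value taken from the report.
    """
    implied = ratio_triangle(values)
    return [
        pair
        for pair, ratio in judged.items()
        if abs(ratio - implied[pair]) / implied[pair] > tol
    ]


# Relative values of attribute ranges I, II, III from the text.
values = [10.0, 30.0, 60.0]
triangle = ratio_triangle(values)  # II/I = 3, III/I = 6, III/II = 2

# A respondent who directly judges III to be 4 times as important as II
# would be asked to reconsider; a judged ratio of 2 passes the check.
flagged = inconsistent_pairs(values, {(1, 2): 4.0})
passed = inconsistent_pairs(values, {(1, 2): 2.0})
```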

The most serious error suggested thus far is a single holistic
judgment exhibiting a substantial degree of error being used in a HOPE
estimation procedure. Since there are zero degrees of freedom, the
error affects several k_i or u_i values individually. Since the error is
substantial, the impact, though moderated by other judgments, is also
substantial. Thus, it is essential to identify, if possible, individual
holistic judgments exhibiting substantial random error.

In the next section a modified HOPE procedure is described. The
modified procedure guides consistency checks over the original set of
holistic judgments and provides a means for detecting substantial random
error.

III. Detection of Error in Holistic Judgments

The HOPE procedure can be easily extended to detect substantial
random error over the set of holistic judgments. The extended HOPE
procedure uses two orthogonal designs with minimal overlap in one of two
possible ways. First, the data of both designs are pooled to estimate
k_i and u_i values. This serves to increase the degrees of freedom in
the HOPE procedure to approximately the number of design points unique
to one design. Noise is then defined by the root mean square error,
i.e., of pooled prediction minus actual judgments, while, arbitrarily,
substantial error is defined to be a deviation exceeding twice the root
mean square error.
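
The flagging rule can be sketched in Python as follows. This is an
illustration only: the function names are hypothetical, the root mean
square error here divides by the number of pairs (the report does not
state its divisor), and the data are the Design 1 stated and predicted
values as read from Table 2. Because the report's threshold of .077 is
based on the pooled data of both designs, it is passed in directly
rather than recomputed from Design 1 alone.

```python
import math


def rmse(stated, predicted):
    """Root mean square of (stated - predicted), dividing by the count."""
    squares = [(s - p) ** 2 for s, p in zip(stated, predicted)]
    return math.sqrt(sum(squares) / len(squares))


def flag_substantial(stated, predicted, threshold=None):
    """Indices whose absolute deviation exceeds the threshold.

    If no threshold is supplied, twice the RMSE of the supplied pairs
    is used, following the (arbitrary) rule described in the text.
    """
    if threshold is None:
        threshold = 2.0 * rmse(stated, predicted)
    return [
        i
        for i, (s, p) in enumerate(zip(stated, predicted))
        if abs(s - p) > threshold
    ]


# Design 1 of Table 2: attribute levels, stated and predicted values.
LABELS = ["WWW", "WII", "WBB", "IWI", "IIB", "IBW", "BWB", "BIW", "BBI"]
STATED = [0.00, 0.60, 0.85, 0.47, 0.75, 0.50, 0.55, 0.60, 0.82]
PREDICTED = [0.00, 0.57, 0.78, 0.38, 0.73, 0.53, 0.55, 0.46, 0.89]

# Using the threshold reported in the Table 2 footnote.
flagged = [LABELS[i] for i in flag_substantial(STATED, PREDICTED, 0.077)]
```

With the reported threshold, the two flagged points are (I,W,I) and
(B,I,W), the same two points marked in Table 2.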

A second approach using two designs is to build two separate
(utility) prediction models, one for each design. The utility
predictions of design 1 (2) are used to predict the actual judgments of
design 2 (1). Substantial error is again arbitrarily defined by
prediction errors exceeding twice the root mean square error.

Either of the above procedures can be used to "detect" excessive
deviations from model predictions, although it may be more reasonable to
consider these procedures as guides for checking consistency over the
original set of holistic judgments. One set of judgments which are
candidates for consistency checks are those judgments which deviate
substantially from predicted values. A second set of candidates are
equal judgments for which the predicted values diverge. For example, if
two consequences, a and b, have judged utilities u(a) = u(b) = .6 but
the predictions differ, say u'(a) = .64 and u'(b) = .57, then a and b
may also be presented to the subject for reconsideration.

Each error detection procedure can be illustrated using the data of
Fischer (1977). Fischer's subjects provided holistic evaluations for 27
hypothetical offers of employment. Each job was (completely) described
by 3 attributes: salary, location (city), and type of work. Each
attribute had 3 possible levels, designated in the tables and discussion
as W, for worst level; I, for intermediate level; and B, for best level.
The 27 jobs represent all combinations of each level of each attribute.
Fischer (1977) provides a detailed description of the elicitation task.
These data have been analyzed via conjoint measurement (Fischer, 1976),
have been subjected to convergent validation tests in conjunction with
several elicitation methods (Fischer, 1977), and have been subjected to
various HOPE procedures (Barron, 1978). For illustrative purposes, we
consider the (risky) responses of subject 2, a subject for whom the
additivity hypotheses (i.e., equations 1 and 2) were rejected by
conjoint measurement analysis (Fischer, 1976, p. 139).

A double design appropriate for the analysis of subject 2's
responses is presented in Table 2. Predicted values for each design are
based on pooling data from both designs. Design 1 indicates two points
for consistency checks, (I,W,I) and (B,I,W). Point (I,W,I) is also a
design 2 point; except for (I,W,I), design 2 has no points indicated for
consistency checks. These two points should now be reconsidered by the
person making the original judgments. (Of course, using these data
supplied by Fischer, this is impossible.)

At this point, let us assume that upon reconsideration subject 2
agreed to reduce the judged (B,I,W) from .60 to .50. If this one change
is made, and the predictive model for the revised pooled data is obtained,
several things happen as shown in Table 3. The new predictions for
(B,I,W) and (I,W,I) are both brought into better agreement with stated
values: (B,I,W) because the stated value was revised, and (I,W,I)
because the predicted value based on the revised (B,I,W) value increased.
All values are brought into better agreement as measured by root mean
square error (RMSE). Revised RMSE over all 16 pairs is smaller than the
prior RMSE either including or excluding the (B,I,W) pair.

Consistency checks should also be performed for sets of equal
judgments as guided by model predictions. In design 2, cells (B,I,I) and
(B,B,W) were each assigned a value of .65. The predicted values are .70
for (B,I,I) and .59 for (B,B,W). The subject should be asked to
reconsider these judgments. Is (B,I,I) really preferred to (B,B,W)? A
similar comment does not apply to design 2 cells (W,I,B) and (W,B,I),
each assigned original values of .70. The predicted values differ, .66
and .69 respectively, but do not diverge.
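
The search for equal judgments with diverging predictions can be
sketched in Python. The report gives no numeric criterion for
"diverge," so the cutoff min_gap = 0.05 below is purely an assumption
chosen to separate the two cases discussed above; the function name
divergent_equal_pairs is likewise hypothetical. The data are the
Design 2 stated and predicted values as read from Table 2.

```python
from itertools import combinations


def divergent_equal_pairs(labels, stated, predicted, min_gap=0.05):
    """Pairs with equal stated values whose predictions differ widely.

    min_gap is an assumed cutoff on the absolute difference between the
    two predicted values; it is not a value given in the report.
    """
    pairs = []
    for i, j in combinations(range(len(labels)), 2):
        same_judgment = stated[i] == stated[j]
        gap = abs(predicted[i] - predicted[j])
        if same_judgment and gap > min_gap:
            pairs.append((labels[i], labels[j]))
    return pairs


# Design 2 of Table 2: attribute levels, stated and predicted values.
LABELS = ["WWW", "IWI", "BWB", "IIW", "BII", "WIB", "BBW", "WBI", "IBB"]
STATED = [0.00, 0.47, 0.55, 0.39, 0.65, 0.70, 0.65, 0.70, 0.85]
PREDICTED = [0.00, 0.38, 0.55, 0.40, 0.70, 0.66, 0.59, 0.69, 0.84]

candidates = divergent_equal_pairs(LABELS, STATED, PREDICTED)
```

Under this assumed cutoff, only the pair (B,I,I) and (B,B,W) is
returned; the pair (W,I,B) and (W,B,I), whose predictions differ by
only .03, is not.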

Using the two designs to estimate separate predictive models
produces similar results. Cells (I,W,I) and (B,I,W) of design 1 are
identified for consistency checks. Revising the (B,I,W) utility suggests
the original (I,W,I) value is not substantially wrong. Furthermore,
design 1 predictions of the equal value cells (B,I,I) and (B,B,W)
suggest the former's utility should be increased and the latter's should
be decreased.

The use of double designs provides a mechanism for detecting
substantial error in judgment and further guiding consistency checks. The
error detection feature comes at the cost of requiring approximately
twice as many judgments as originally required. Its advantage is that
it provides a within-elicitation method for convergent validation of
field assessed utility judgments.

IV. Application Considerations

Proponents of each elicitation method described can point to an
impressive number of applications. Illustrative of KR are Keeney (1973,
1976, 1977); of SMART, Edwards (1977), Gardiner and Edwards (1975); of
SJT, Hammond and Adelman (1976); and of HOPE, Wind and Spitz (1976), and
those cited in Carmone et al. (1977). This section will provide neither
an extensive analysis of each method nor an applications critique of
each method. Instead, practical considerations will be highlighted.

Keeney has reported two applications (Keeney, 1976, 1977) that
provide special insight into the process of elicitation itself, a
process which is intensive, demanding and dynamic. In each case,
respondents were highly motivated professionals who had thought deeply
about the respective problems, and Keeney is an especially skillful
assessor. Generalizing from these reported applications, a KR procedure
is expensive in terms of both respondent and assessor time. It is also
an intensive process requiring a rather skilled assessor. Respondents
are often required to be conversant with probability concepts. There is
no estimate of "noise," and no particular procedure guides consistency
checks. When the assessment is finished, the result is a utility
function. Random error may implicitly be considered via sensitivity analyses.
The procedure's advantages are its obvious tie to underlying theory, its
attention to model specification, and the likely independence of noise
in the scaling constants and univariate utility functions. Its
disadvantages are the requirements that possibly unfamiliar constructs
(scaling constants and univariate utility functions) must be directly
assessed in a possibly unfamiliar language (probability theory).

The primary advantage of the SMART procedure is its simplicity.
Based on simple rating procedures to deduce weights and utility
functions, it has the further advantage of being easy to teach to
(probabilistically) naive decision makers. It is easily adapted to
hierarchical utility structures. Although the procedure is believed to be
robust, a disadvantage is its sole reliance on the additive model. An
error detection procedure for relative weights consists of a triangular
matrix of ratios as described earlier in this report.

SJT and HOPE procedures rely on holistic judgments. It is
generally acknowledged that people are capable of assigning values
holistically in a consistent and meaningful fashion provided the number of
attributes is, say, less than 10 and the total number of evaluations is,
say, less than 50, although occasional judgments are subject to
substantial error. Thus, it can be argued that each procedure uses
psychologically meaningful and familiar judgments. The disadvantages of
SJT include reliance on the additive model, equation (2), and a tendency
to infer linear utility functions. There is also the difficulty of
detecting error in individual judgments.

HOPE has been characterized as a "decomposition procedure which
relies solely on holistic assessments" (Fischer, personal
communication). It includes procedures designed to both aid correct model
specification and detect substantial error in individual judgments. Its
obvious limitations are that it requires credible orthogonal attribute
combinations and it does not apply to value hierarchies.

The HOPE procedure seems to have certain advantages when probing
utilities of large numbers of people. For example, first a few
representative individuals can be subjected to an intensive process. Here
the reasonableness of preferential and utility independence properties
can be checked and unimportant attributes can be discarded. Next, a
larger sample of respondents can be assessed via questionnaire.
Subsequently, questionnaire assessments could be checked through a small
sample of follow-up interviews.

The HOPE procedure is an extension of an additive conjoint
measurement approach to modeling consumer preferences for multi-attribute
alternatives (Green and Rao, 1971). Early applications of the conjoint
methodology, e.g., Green, Carmone, and Wind (1972), relied on full
factorials, rank-order responses, and nonmetric scaling procedures. Green
et al. have subsequently simplified the basic approach considerably.
Orthogonal designs significantly reduce the number of consequences to be
evaluated from a prohibitive to a feasible number. Direct rating of a
few consequences is often perceived as less tedious than ranking.
Metric recovery requires substantially less computational capacity.
The efficacy of these simplifications has been supported in the recent
simulation studies of Carmone et al. (1977) and Barron and Person
(1978). That there have been over 200 industrial applications of
conjoint analysis (using both full factorials and, more recently,
orthogonal arrays) attests to HOPE's practicability. The refinements of
correct model specification and error detection can only enhance its
usefulness.

Having  practical  alternatives  to  KR  and  SMART  provides  obvious 
advantages.  Assessment  can  be  more  easily  tailored  to  the  specific 
situation  — considering  costs,  nature  of  respondents,  and  importance  of 
the  contemplated  decision.  The  particular  strengths  and  weaknesses  of 
different  procedures  can  be  further  determined  through  actual  practice 
and  empirical  research. 


Appendix 1: Estimation of the Parameter K


Consequence 13 (Table 1), denoted below by "a," consists of the
highest levels of attributes 1, 3, 4, 5 and the lowest level of
attribute 2. If one additional consequence, denoted "b," is defined to have
attribute 2 at its highest level and attributes 1, 3, 4, 5 at their
lowest levels and is included with the orthogonal design, then ratings
of these complementary consequences a and b can be used to estimate K.

If the MAU is additive, then the sum of the observed utilities for
the complementary consequences is 1.0. Otherwise define consequence "c"
to have all attributes at their highest levels. By regrouping terms
from the right side of equation (1), we have the following:

     1 + K u(c) = (1 + K k_2)(1 + K u(a))          (3)

     k_2 = u(b)                                    (4)

     u(c) = 1                                      (5)

Substituting (4) and (5) into (3) gives
1 + K = (1 + K u(b))(1 + K u(a)); solving for the nonzero root gives

     K = (1 - u(a) - u(b))/(u(a) u(b)).

In practice, the ratings u(a) and u(b) are estimates, so K is also an
estimate.
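
As a worked illustration of this formula, the following Python sketch
computes K from a pair of complementary ratings. The function name
estimate_K and the tolerance tol are hypothetical. The numerical inputs
assume that, for the three-attribute data of Table 2, the stated values
of (B,W,B) and (W,B,W) play the roles of u(a) and u(b); that reading of
how combination 10 was used is an assumption, not something the report
spells out.

```python
def estimate_K(u_a, u_b, tol=1e-9):
    """Estimate K from ratings of two complementary consequences.

    Returns 0.0 (additive model) when u_a + u_b is within tol of 1;
    otherwise applies K = (1 - u(a) - u(b)) / (u(a) u(b)).
    """
    if u_a <= 0.0 or u_b <= 0.0:
        raise ValueError("both ratings must be positive")
    if abs(1.0 - u_a - u_b) < tol:
        return 0.0
    return (1.0 - u_a - u_b) / (u_a * u_b)


# Assumed reading of Table 2: u(a) = .55 for (B,W,B), u(b) = .60 for (W,B,W).
K = estimate_K(0.55, 0.60)

# The estimate satisfies equation (3) with u(c) = 1 and k_2 = u(b).
residual = (1.0 + K) - (1.0 + K * 0.60) * (1.0 + K * 0.55)
```

Since the two ratings sum to more than 1, the estimate is negative
(about -0.45); ratings summing to exactly 1 return K = 0.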


Appendix 2: Illustration of the Arithmetic of the HOPE Procedure


Assume the decision maker has provided holistic assessments for the
16 consequences of Table 1. Designate these values h_1, h_2, ..., h_{16}.
Designate by h_{17} the assessment of consequence (1, 4, 1, 1, 1), that is,
the consequence having attribute 2 at its best level and all other
attributes at their worst levels. The arithmetic proceeds as follows:

Step 1.  Following Appendix 1, compute h_{13} + h_{17}. If
         h_{13} + h_{17} = 1, then K = 0 and we estimate the parameters
         of the additive model (steps 2-5). If h_{13} + h_{17} ≠ 1,
         proceed to step 6 to estimate the parameters of the
         multiplicative model.

Step 2.  A_1 = (h_1 + h_2 + h_3 + h_4)/4
         A_2 = (h_5 + h_6 + h_7 + h_8)/4
         A_3 = (h_9 + h_{10} + h_{11} + h_{12})/4
         A_4 = (h_{13} + h_{14} + h_{15} + h_{16})/4

Step 3.  k_1 = A_4 - A_1
         k_1 u_1 (level 3 of x_1) = A_3 - A_1
         k_1 u_1 (level 2 of x_1) = A_2 - A_1

Step 4.  Repeat steps 2 and 3 for each attribute. Note the definition
         of A_1, A_2, A_3, A_4 changes for each attribute, as defined by
         the design. For example, for attribute 2
         A_1 = (h_1 + h_5 + h_9 + h_{13})/4
         A_2 = (h_2 + h_6 + h_{10} + h_{14})/4

Step 5.  Compute the sum k_1 + k_2 + ... + k_5. If this sum does not
         equal one, then normalize all values computed in step 3 for
         each attribute by dividing each value by this sum.

Step 5 completes the estimation process for the additive model. Steps
6-10 describe the estimation process for a multiplicative model.

Step 6.  First calculate K', an estimate of K, using the following
         relationship derived in Appendix 1:

         K' = (1 - h_{13} - h_{17})/(h_{13} h_{17})

Step 7.  Define h'_i = ln(1 + K' h_i), i = 1, 2, ..., 16.

Step 8.  Repeat step 2 (additive model) using the h'_i values.

Step 9.  1 + K' k_1 = e^(A_4 - A_1)
         1 + K' k_1 u_1 (level 3 of x_1) = e^(A_3 - A_1)
         1 + K' k_1 u_1 (level 2 of x_1) = e^(A_2 - A_1)

Step 10. Repeat steps 8 and 9 for each attribute. As before, the
         definitions of A_1, A_2, A_3, A_4 are given by the design and
         differ for each attribute.
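
The arithmetic of steps 1-10 can be sketched in Python. To keep the
example short, the sketch uses the nine-point, three-level Design 1 of
Table 2 rather than the sixteen consequences of Table 1, so level means
are taken over three judgments instead of four; that substitution, the
function names, and the tolerance tol are all assumptions made for
illustration. Step 9 is solved algebraically for k_i u_i. The level-mean
arithmetic relies on the design being orthogonal, so that the other
attributes average out within each level.

```python
import math

LEVELS = "WIB"  # worst, intermediate, best

# Design 1 of Table 2 (attributes 1, 2, 3 read left to right).
DESIGN_1 = ["WWW", "WII", "WBB", "IWI", "IIB", "IBW", "BWB", "BIW", "BBI"]


def level_means(design, h, attr):
    """Step 2: average the judgments at each level of one attribute."""
    means = {}
    for level in LEVELS:
        values = [v for combo, v in zip(design, h) if combo[attr] == level]
        means[level] = sum(values) / len(values)
    return means


def hope_additive(design, h):
    """Steps 2-5: k_i * u_i(level) per attribute, normalized so sum k_i = 1."""
    effects = []
    for attr in range(len(design[0])):
        means = level_means(design, h, attr)
        effects.append({lv: means[lv] - means["W"] for lv in LEVELS})
    total_k = sum(e["B"] for e in effects)  # step 5: k_1 + ... + k_n
    return [{lv: e[lv] / total_k for lv in LEVELS} for e in effects]


def hope_multiplicative(design, h, K):
    """Steps 7-10: k_i * u_i(level) per attribute for a nonzero K'."""
    h_log = [math.log(1.0 + K * v) for v in h]  # step 7
    effects = []
    for attr in range(len(design[0])):
        means = level_means(design, h_log, attr)  # step 8
        # Step 9, solved for k_i * u_i(level).
        effects.append(
            {lv: (math.exp(means[lv] - means["W"]) - 1.0) / K for lv in LEVELS}
        )
    return effects


def hope(design, h, u_a, u_b, tol=1e-9):
    """Steps 1 and 6: pick the model from the complementary pair a, b."""
    if abs(u_a + u_b - 1.0) < tol:
        return 0.0, hope_additive(design, h)
    K = (1.0 - u_a - u_b) / (u_a * u_b)
    return K, hope_multiplicative(design, h, K)
```

Here u_a and u_b are the ratings of the two complementary consequences,
for instance (B,W,B) and (W,B,W) in the three-attribute case. With
error-free judgments generated from a multiplicative model, the sketch
returns the generating K and k_i u_i values; with real judgments the
results are estimates and carry whatever error the judgments contain.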
Table 1

Attribute Levels for a 4^5 Orthogonal Design

                        Attributes*
Consequence      1     2     3     4     5

     1           1     1     1     1     1
     2           1     2     2     3     4
     3           1     3     3     4     2
     4           1     4     4     2     3
     5           2     1     2     2     2
     6           2     2     3     4     3
     7           2     3     4     3     1
     8           2     4     1     1     4
     9           3     1     3     3     3
    10           3     2     4     1     2
    11           3     3     1     2     4
    12           3     4     2     4     1
    13           4     1     4     4     4
    14           4     2     1     2     1
    15           4     3     2     1     3
    16           4     4     3     3     2

* Level 1 is worst; level 4 is best.
Table 2

Attribute Level Combinations, Stated Holistic Judgments
and Predicted Judgments for Two Orthogonal Designs*

                       Design 1                         Design 2
Combination   1  2  3   Stated  Predicted      1  2  3   Stated  Predicted
                        Value   Value                    Value   Value

     1        W  W  W    .00     .00           W  W  W    .00     .00
     2        W  I  I    .60     .57           I  W  I    .47     .38**
     3        W  B  B    .85     .78           B  W  B    .55     .55
     4        I  W  I    .47     .38**         I  I  W    .39     .40
     5        I  I  B    .75     .73           B  I  I    .65     .70
     6        I  B  W    .50     .53           W  I  B    .70     .66
     7        B  W  B    .55     .55           B  B  W    .65     .59
     8        B  I  W    .60     .46**         W  B  I    .70     .69
     9        B  B  I    .82     .89           I  B  B    .85     .84
    10        W  B  W    .60                   W  B  W    .60

* Attributes are designated 1, 2, 3. Levels are denoted W, for worst;
I, for intermediate; B, for best. Parameter K is estimated using
combination 10.

** Stated value differs from predicted value by at least .077, twice
the estimated root mean square error.
Table 3

Stated Holistic Judgments, Original Predictions and
Revised Predictions for Two Orthogonal Designs

                          Design 1                            Design 2
Combination   Stated   Revised      Original      Stated   Revised      Original
              Value    Prediction   Prediction    Value    Prediction   Prediction

     1         .00      .00          .00           .00      .00          .00
     2         .60      .57          .57           .47      .40          .38
     3         .85      .79          .78           .55      .55          .55
     4         .47      .40          .38           .39      .38          .40
     5         .75      .73          .73           .65      .68          .70
     6         .50      .53          .53           .70      .66          .66
     7         .55      .55          .55           .65      .57          .59
     8         .50*     .43          .46           .70      .70          .69
     9         .82      .81          .89           .85      .85          .84

* Revised from .60 by assumption.


